The ‘Update’

Further Analysis
I take new songs produced by two of the artists to use for further analysis.

As can be seen in my storyboard from last week, I took 3 new songs to make comparisons with, all released after I created my corpus. These songs are:

  • Lake Of Fire as comparison song from D-Block & S-te-Fan.
  • Fallen Souls as base-line song for D-Block & S-te-Fan.
  • Knockout as base-line song for Da Tweekaz.

I took these songs because I found both songs from D-Block & S-te-Fan and the song from Da Tweekaz to be very distinctive for these artists. Last week I made two Dynamic Time Warp plots to see whether I could find any similarities or differences that way. This week I made cepstrograms and self-similarity matrices for all three songs, to see how much the two songs from D-Block & S-te-Fan resemble each other and how the comparison song from D-Block & S-te-Fan differs from the song from Da Tweekaz. I compare these in the next two stories.

Cepstrograms of the songs


Although the cepstrograms look very nice, they do not give a direct clue that Lake Of Fire and Fallen Souls are from the same artist. The only thing that can be noticed is that both songs are less loud than Knockout. Sadly, we don’t see any patterns that appear in both Lake Of Fire and Fallen Souls but not in Knockout. There is, however, one interesting thing about both the cepstrograms and the self-similarity matrices of the three songs, which I will explain in the Self-Similarity Matrix section.

Self-Similarity Matrix of the songs


Looking at the self-similarity matrices, we see very different plots. This, again, is not what I wanted to see, since I had hoped for shared features between the first two matrices. It was of course to be expected, since both the cepstrograms and the similarity matrices are made from the same data/features of the three songs.

The similarity matrix of Fallen Souls looked especially interesting because it has only one yellow horizontal (and thus vertical) line, so I listened to all three songs. I found out that the yellow parts in both the cepstrograms and similarity matrices of Lake Of Fire and Knockout correspond to a ‘drop’, or at least to an increase in overall music intensity and especially in the intensity of the bass frequencies. The interesting bit is that the yellow part in Fallen Souls actually corresponds with the sound of a piano! This piano is supported by a high sound in the background at about the same pitch, resulting in a high magnitude at c02. It is also worth noticing that after this bit there is an increase in magnitude at the c05 level. This is because after both high-magnitude bits at c02 there is a part where a female voice sings: “In a place afar from home, there is a haunted house along an empty road, and if you listen close you can hear the whispers of the fallen souls” before going to a lower pitch. The dark blue squares correspond to parts where only a synth plays the main melody of the song, which explains the relatively small magnitude.
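To make concrete what a self-similarity matrix contains, here is a minimal sketch in plain Python. It is not the code behind the plots above: the frame values are made up, the function names are hypothetical, and cosine distance is only one possible choice of distance. Each cell (i, j) holds the distance between the feature vector at time step i and the one at time step j.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: treat an all-zero frame as maximally distant
    return 1.0 - dot / (norm_a * norm_b)

def self_similarity_matrix(frames):
    """Square matrix of pairwise distances between all frames of one song."""
    n = len(frames)
    return [[cosine_distance(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]

# Made-up timbre vectors: two similar "intro" frames and one "drop" frame.
frames = [[1.0, 0.2, 0.1], [0.9, 0.25, 0.1], [0.1, 1.0, 0.8]]
ssm = self_similarity_matrix(frames)
```

Because the distance from frame i to frame j equals the distance from j to i, the matrix is symmetric. That is why a horizontal line in such a plot always comes with a matching vertical line.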

Introduction

The Idea

For my project I’m going to do research in the genre of hardstyle music. A lot of people would say that the music within this genre all sounds alike. However, there is a common assumption that each artist distinguishes him- (or her)self with a unique style and sound. This is most noticeable in the tones used in the so-called drop and in the bass kick.

I’m going to research whether this assumption can be supported with a (computer) model, in particular a classification model that can assign a song to an artist (assuming that the song is by one of the artists used for training the model). If such a model can be built, we can assume that there is indeed something in the songs that is unique to each artist. However, if such a model is not possible, I’m going to research why this is the case, or what is necessary to create such a model.

The Corpus

To do this research we obviously need some data to work with. For this I’m going to use the songs from the top 5 hardstyle artists together with the songs of my 2 favorite artists. Together this gives me a corpus of 698 songs, where each artist has about 50 songs or more. This should be enough data to build a decent classification model.

Artist              Songs on Spotify
Noisecontrollers    199
Headhunterz         146
Brennan Heart       100
Showtek             88
Da Tweekaz          62
Sub Zero Project    56
D-Block & S-te-Fan  47

The Data

Data Understanding
In this section I do a quick exploration of the data to see whether there are already some patterns.


Before trying to build a classifier we first need to explore and understand the data. First, we need to decide which information we are going to use for the classifier. For example, the genre will probably be similar for each artist and thus will not be useful. Furthermore, there are two possible sets of features we can use to train the classifier:

  • Track Features: these features are returned by the get_track_audio_features() method of the spotifyr package. This method is also used to get the track features in the get_artist_audio_features() and get_album_audio_features() methods. These features describe the song as a whole, so we get one value per feature per song.
  • Track Analysis: these features are obtained using the get_track_audio_analysis() method from the spotifyr package. The analysis features are quite a bit more extensive than the track features, and thus will probably contain a lot more information about the song. However, this means more data, which will take up more disk space, take longer to obtain from Spotify, make training the classifier take more time and make the model quite a bit more complex (since we now need to add a time dimension to our model).

For the reasons described above, I’m first going to focus on a model built with the track features. If I fail to create a good model with these features, I’m going to take a look at the track analysis features.

The track features include the following numeric features that may be useful:

  • Danceability
  • Energy
  • Key
  • Loudness
  • Mode
  • Speechiness
  • Acousticness
  • Instrumentalness
  • Liveness
  • Valence
  • Tempo

This is a lot of data, and some features may be very similar for all songs. It is useless to include such features in the training data for the classifier, since they wouldn’t help to distinguish two songs from each other, let alone different artists.

To give good insight into these features and a quick overview of which of them may show clear differences between the artists, I have combined all songs from all artists into one dataset and plotted each feature here (I would have made the plots interactive, but due to limitations of the plotly package they are just images). From these plots you can observe that the Mode feature is not useful. Some other features don’t seem to show any clear patterns on their own either. That is why I have also plotted each feature against the other features. If one of these plots already showed clusters, we would probably only need those two features to train a classifier. However, I have put these plots in a separate document because of their number (121) and because no plot seems to show any clear clusters. So although the scatterplots did seem to show some nice patterns, these patterns seem to be very similar among the artists.

This means we need to perform Principal Component Analysis. I will elaborate on this in the Data Preparation section. If the PCA gives us good clusters, we know that we can quite easily build a classifier. If it doesn’t give us any noticeable clusters, the data may still be separable, but in higher dimensions. This is quite hard to visualize, so in that case I will probably just feed the data to the classifier and hope that it is able to find relations between the features and the artist.

Data Preparation (Dimensionality Reduction)
Not all data is useful for a classifier, therefore we need to reduce the dimensionality of our dataset.


Dimensionality Reduction

Before we can feed the data to the classifier we first need to prepare it. One part of data preparation is data reduction, which means that we reduce the initial dataset to only the data we are going to use for the classifier. Since we are going to use the track features, we can discard all data other than these features. I combined these features into a new data frame and saved that as my new corpus. Previously I mentioned removing the Mode feature from our data as well; however, since we are now going to perform PCA, I will keep this feature for now. Of course we still need to include the Artist in our reduced dataset, since we need it as the class label for the model.

Principal Component Analysis

As mentioned in the Data Understanding section, we need to apply Principal Component Analysis to the data, since the features on their own or relative to one other feature didn’t show any good clusters. Principal component analysis transforms the data into a new dataset where each column (principal component) is a combination of the original features, ordered so that the first components capture as much of the variation in the initial data as possible. This data may be even better to use than the features on their own, since it is denser in information and a few components can summarize most of the variation. I first ran a PCA on the data and plotted the first two principal components against each other (see Figure 1).
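The idea behind a principal component can be sketched in plain Python. This is only an illustration under stated assumptions, not the analysis behind Figure 1: the three-feature rows are made up, the function names are hypothetical, and only the first component is approximated, using power iteration on the covariance matrix of the standardized features.

```python
import math

def standardize(rows):
    """Center every column on 0 and scale it to unit sample variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (len(col) - 1))
        for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

def covariance_matrix(rows):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

def first_principal_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov with power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue

# Made-up rows of (energy, loudness, valence); not taken from the corpus.
songs = [
    [0.9, -3.0, 0.2],
    [0.8, -4.0, 0.5],
    [0.7, -5.0, 0.3],
    [0.6, -6.0, 0.6],
    [0.5, -7.0, 0.4],
]
scaled = standardize(songs)
cov = covariance_matrix(scaled)
pc1, variance = first_principal_component(cov)

# Share of the total variance captured by PC1, and each song's PC1 score.
explained_share = variance / sum(cov[i][i] for i in range(len(cov)))
scores = [sum(x * w for x, w in zip(row, pc1)) for row in scaled]
```

In this toy data energy and loudness move together, so the first component weights them equally and captures most of the variance on its own. A full PCA repeats this for the remaining directions; a statistics package does that in one call.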

As you can see, there is still no clear separation between the artists; if anything, they all seem to be very close in terms of PC1 and PC2. However, since the PCA consists, in our case, of 11 principal components, this plot might not show all differences between the artists. Thus we need a way to see whether all principal components together can be used to distinguish the artists.

K-Means Clustering

We can use clustering for this, and k-means clustering in particular. This algorithm first assigns each data point to a random cluster. It then iteratively calculates the center point of each cluster and assigns each data point to the cluster with the nearest center point, using a distance measure that handles multidimensional data (such as Euclidean or Manhattan distance). To compare the standard features with the principal components, I applied k-means clustering to both the feature data and the principal component data with 7 clusters (since we have 7 artists).
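The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the clustering code used for the figures: the points are made up, the function names and the seed parameter are hypothetical, Euclidean distance is assumed, and an empty cluster is simply re-seeded with a random data point.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two points of any dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]

def k_means(points, k, seed=0, max_iterations=100):
    """Random initial assignment, then alternate center and assignment steps."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]
    centers = []
    for _ in range(max_iterations):
        # Center step: mean of every cluster (re-seed a cluster that is empty).
        centers = []
        for cluster in range(k):
            members = [p for p, label in zip(points, labels) if label == cluster]
            centers.append(centroid(members) if members else list(rng.choice(points)))
        # Assignment step: every point moves to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # nothing changed, so the clustering has converged
        labels = new_labels
    return labels, centers

# Made-up 2-D points forming two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = k_means(points, k=2)
```

The cluster numbers themselves are arbitrary; what matters is which points end up sharing a label. For the corpus the call would use k=7 on either the feature rows or the principal component scores.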

If the data can be clustered effectively, we would see 7 separate circles, and each circle would contain only one type of point. However, as you can see in Figure 2 and Figure 3 (which can be found in interactive form in the next two stories), most data points are in one big cluster and the circles of all clusters overlap. We do see that using the principal components as data for the clustering algorithm helped to separate one cluster from all others. But if we examine this cluster, we see that 6 out of the 7 artists have songs in it:

[1] "Noisecontrollers"   "Headhunterz"        "Brennan Heart"     
[4] "Da Tweekaz"         "Sub Zero Project"   "D-Block & S-te-Fan"
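A check like the one above, which artists end up in which cluster, can be sketched as follows. The labels here are invented for illustration and the function names are hypothetical. Purity is an extra summary I am assuming could be useful: the fraction of songs that belong to the majority artist of their cluster.

```python
from collections import Counter, defaultdict

def cluster_composition(cluster_labels, artists):
    """Count, per cluster, how many songs of each artist it contains."""
    composition = defaultdict(Counter)
    for cluster, artist in zip(cluster_labels, artists):
        composition[cluster][artist] += 1
    return composition

def purity(composition):
    """Fraction of songs that belong to the majority artist of their cluster."""
    total = sum(sum(counts.values()) for counts in composition.values())
    majority = sum(max(counts.values()) for counts in composition.values())
    return majority / total

# Invented example: six songs, two clusters, three artists.
cluster_labels = [0, 0, 0, 1, 1, 1]
artists = ["Showtek", "Showtek", "Headhunterz",
           "Headhunterz", "Headhunterz", "Da Tweekaz"]

composition = cluster_composition(cluster_labels, artists)
artists_in_cluster_0 = sorted(composition[0])
score = purity(composition)
```

A purity close to 1 would mean each cluster is dominated by one artist. A cluster that holds songs of 6 out of 7 artists pulls the score down.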

Figure 2 (interactive)

Figure 3 (interactive)

Data Preparation (Subsetting)
Before we can train a classifier we need training and test data. To reduce data loss I use cross-validation.

(Upcoming) Subsetting the data for the classifier

Since we are going to train a classifier we also need to separate the data into two subsets:

  • A training set, containing about 80% of the data. This data will be used to train the classifier.
  • A test set, containing the remaining 20% of the data. This data will be used to test the classifier.

However, taking only 20% of the data as validation data means we get only about 140 songs to validate the classifier. In the best scenario this would mean 20 songs per artist; however, since we don’t have an equal number of songs per artist, we will most likely not get 20 songs per artist in the validation data.
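The arithmetic behind this concern can be worked out from the song counts in the corpus table. The sketch below assumes, purely for illustration, that the 20% sample is drawn exactly in proportion to each artist's share of the corpus; a random split will deviate from these numbers.

```python
# Song counts per artist, copied from the corpus table.
songs_per_artist = {
    "Noisecontrollers": 199,
    "Headhunterz": 146,
    "Brennan Heart": 100,
    "Showtek": 88,
    "Da Tweekaz": 62,
    "Sub Zero Project": 56,
    "D-Block & S-te-Fan": 47,
}

TEST_FRACTION = 0.2  # the 20% held out for testing

total_songs = sum(songs_per_artist.values())
expected_test_total = total_songs * TEST_FRACTION

# Expected number of test songs per artist under a proportional split.
expected_test_songs = {
    artist: count * TEST_FRACTION for artist, count in songs_per_artist.items()
}
smallest_artist = min(expected_test_songs, key=expected_test_songs.get)
```

Under this assumption the artist with the fewest songs contributes fewer than 10 test songs, well below the 20 of the equal-split scenario.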

For this reason I’m going to perform cross-validation on the data with 5 folds. This means that I’m going to shuffle the data and then divide it into 5 parts (so each part is 20% of the data). After that I will repeatedly take one part as test set and the other parts as training set. This way every song is used once as test data and 4 times as training data, so the model is evaluated on all songs rather than on a single 20% sample, which should make overfitting easier to spot and hopefully lead to a model that classifies novel songs better.
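A minimal sketch of this 5-fold scheme in plain Python is given here. The function name and the seed are hypothetical, and the folds are not stratified per artist, which is an assumption on my side. The function only produces row indices, so it can be combined with any classifier.

```python
import random

def k_fold_indices(n_items, k=5, seed=42):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k parts of near-equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 698 is the corpus size mentioned earlier in this portfolio.
splits = list(k_fold_indices(698, k=5))
test_sizes = sorted(len(test) for _, test in splits)
```

Each index appears in exactly one test fold and in the training set of the four other folds.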

Testing on Novel Data

Novel Data
I gather the songs that were released after I made my initial corpus, to use as novel test data.

As mentioned earlier, I want to test my models and techniques on novel data: songs that have been released by one of the artists of my corpus after I made my initial corpus. A few songs have been released already. I have listed some of them below:

Artist              Track Name
D-Block & S-te-Fan  Lake Of Fire
D-Block & S-te-Fan  Feel It!
D-Block & S-te-Fan  Fallen Souls
Brennan Heart       Born & Raised
Da Tweekaz          Power Of Perception
Da Tweekaz          Knockout
Da Tweekaz          We Made It

Song Similarity using Chroma Features
I compare some songs by comparing their chromagrams.

One way to see whether two songs belong to the same artist might be to use chromagrams. From the list of novel songs we see that D-Block & S-te-Fan and Da Tweekaz have released the most songs in the past weeks. So I chose 2 songs from D-Block & S-te-Fan and one song from Da Tweekaz:

  • Lake Of Fire as comparison song.
  • Fallen Souls as base-line song for D-Block & S-te-Fan.
  • Knockout as base-line song for Da Tweekaz.

Then I created a Dynamic Time Warp plot between the comparison song and each base-line song, to see whether there are similarities between both songs from D-Block & S-te-Fan and a big difference between the comparison song and the base-line song of Da Tweekaz.

The plots can be found below:

In my opinion the DTW between the two D-Block & S-te-Fan songs looks a lot more symmetrical, which would indicate that both songs follow a similar pattern. In the DTW between the D-Block & S-te-Fan song and the Da Tweekaz song, on the other hand, the two songs are clearly distinguishable, which would indicate that they are very dissimilar.
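The visual comparison above could be complemented by a single number: the total cost of the best alignment found by dynamic time warping. The sketch below is plain Python with invented one-dimensional sequences and hypothetical names. For chromagrams each element would instead be a 12-value chroma vector and dist a vector distance.

```python
def dtw_distance(seq_a, seq_b, dist):
    """Total cost of the cheapest monotonic alignment between two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i items of seq_a
    # with the first j items of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # seq_b[j-1] is matched to several items of seq_a
                cost[i][j - 1],      # seq_a[i-1] is matched to several items of seq_b
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]

def absolute_difference(x, y):
    return abs(x - y)

# Invented sequences: `stretched` has the same shape as `base` but a longer start.
base = [0, 1, 2, 1, 0]
stretched = [0, 0, 1, 2, 1, 0]
different = [2, 2, 0, 0, 2]

same_shape_cost = dtw_distance(base, stretched, absolute_difference)
other_shape_cost = dtw_distance(base, different, absolute_difference)
```

A low cost means one sequence can be stretched onto the other with little mismatch. Comparing the cost of Lake Of Fire against each base-line song would be one possible way to turn the plots into a score.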

We now need to find a way to convert this dissimilarity into a way to link an artist with a song.

Upcoming

All items below are a plan and description of things I’m going to do in the upcoming weeks. These are subject to change as a result of feedback and/or new things learned during the course.

Modeling
This is where the actual models are created, and we finally see what they are capable of!

As mentioned before, I’m going to train multiple models on different subsets of the initial dataset:

  • A model trained on all features mentioned in the Data Understanding without the Mode feature.
  • A model trained on (a subset of) the Principal Components.
  • (Optionally) A model trained on (a subset of) the get_track_audio_analysis() features, if necessary.

Depending on the results of these models I may deduce what is causing them to perform a certain way, and build new models that may perform better.

Evaluation
Here I evaluate the performance of each model and explore the reasons for their performance.

For each model we can assess its performance by the number of songs it classified correctly. We can use an extensive confusion matrix to compare the number of correctly classified songs to the number of incorrect classifications, and to see which artists are confused with each other. After all, if the incorrectly classified songs of a certain artist were almost always assigned to one particular other artist, those artists must be very similar to each other in terms of the features used for the model.
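Such a confusion matrix can be sketched with a simple counter. The true and predicted artists below are invented to show the mechanics and are not results of any model. The function names are hypothetical.

```python
from collections import Counter

def confusion_counts(true_artists, predicted_artists):
    """Count every (true artist, predicted artist) combination."""
    return Counter(zip(true_artists, predicted_artists))

def accuracy(counts):
    """Fraction of songs whose predicted artist equals the true artist."""
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return correct / sum(counts.values())

def most_confused_pair(counts):
    """The (true, predicted) pair with the most misclassified songs, if any."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return max(errors, key=errors.get) if errors else None

# Invented labels for six songs; NOT actual model output.
true_artists = ["Headhunterz", "Headhunterz", "Headhunterz",
                "Showtek", "Showtek", "Da Tweekaz"]
predicted_artists = ["Headhunterz", "Showtek", "Showtek",
                     "Showtek", "Showtek", "Da Tweekaz"]

counts = confusion_counts(true_artists, predicted_artists)
overall_accuracy = accuracy(counts)
worst_pair = most_confused_pair(counts)
```

The off-diagonal pair with the highest count points at the two artists that the features distinguish worst.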

Ideally we want to create a model that uses only the features returned by the get_track_audio_features() method, without diving into the sound and melody of a song (the data returned by the get_track_audio_analysis() method). I’m going to compare the different models with each other and find out which model performed best and why. If two artists turn out to be very similar in each model, I may look for the most similar tracks and compare them subjectively, for example by listening to them, to see whether those tracks are indeed similar to the ear as well.

For the future I hope that some of the artists used to build the model will release new songs, allowing me to test the model on novel data to see whether it is indeed as good as it proved to be on the test data.

IDEA: compare Lake Of Fire with a random song from each other artist; still need to figure out how exactly.